Our goal in this section is to adjust or “wrangle” the data from each year into a common format so that we can combine the data sets across years for our analysis, and so that we have values in our variables that are correct and easy to interpret. We will need to understand what is the same and what is different across the data from different years, rename and recode the variables (e.g., by replacing the numbers 1 and 2 with the values “Male” and “Female” for the Sex variable), and combine the data. We will walk through these steps below.
First, let’s take a look at our data. We can get a good sense of it using the glimpse() function of the dplyr package.
Rows: 17,711
Columns: 29
$ psu <chr> "015438", "015438", "015438", "015438", "015438", "015438"…
$ finwgt <dbl> 216.7268, 324.9620, 324.9620, 397.1552, 264.8745, 264.8745…
$ stratum <chr> "BR3", "BR3", "BR3", "BR3", "BR3", "BR3", "BR3", "BR3", "B…
$ Qn1 <dbl> 10, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,…
$ Qn2 <dbl> 2, 1, 1, 1, 2, 2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2…
$ Qn3 <dbl> 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 5, 5…
$ ECIGT <dbl> 2, 1, 2, 1, 2, 1, 1, 1, 2, 2, 2, 2, 1, 2, 2, 1, 2, 1, 2, 1…
$ ECIGAR <dbl> 1, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2…
$ ESLT <dbl> 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ EELCIGT <dbl> 2, 1, 2, 1, 2, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 1…
$ EROLLCIGTS <dbl> 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2…
$ EFLAVCIGTS <dbl> 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ EBIDIS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ EFLAVCIGAR <dbl> 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 1, 2…
$ EHOOKAH <dbl> 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ EPIPE <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ ESNUS <dbl> 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ EDISSOLV <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CCIGT <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CCIGAR <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CSLT <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CELCIGT <dbl> 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CROLLCIGTS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CFLAVCIGTS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CBIDIS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CHOOKAH <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CPIPE <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CSNUS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ CDISSOLV <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
Updating the set of variables and their names
The easiest way to make the data from the different years combinable is to ensure that all of the data sets contain the same variables with the same names. In addition, giving the columns informative names will help make our code more readable. Currently, it isn’t clear what most of the variables indicate, since the variable names are uninformative without the codebook.
We want to rename variables like Qn1 to something more meaningful like Age.
To do this we will use the rename() function of the dplyr package. The new name is always listed first before the =. This function will replace the old variable names with the new ones, i.e., after running the code below, there will no longer be a Qn1 variable in the data set, but there will be an Age variable instead. We will start working with the 2015 data, and then move on to the other years down below.
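As a minimal sketch of this renaming step, the example below uses a small made-up tibble (`toy_2015`, a hypothetical stand-in for the 2015 data) with the three question columns that become Age, Sex, and Grade in the output shown later in this section. The values are illustrative only.

```r
library(dplyr)

# Hypothetical miniature version of the 2015 data (values are made up)
toy_2015 <- tibble(Qn1 = c(10, 9), Qn2 = c(2, 1), Qn3 = c(7, 7))

# In rename(), the new name goes on the left of `=`, the old name on the right
toy_2015 <- toy_2015 %>%
  rename(Age = Qn1,
         Sex = Qn2,
         Grade = Qn3)

names(toy_2015)
```

After this call the tibble no longer has Qn1, Qn2, or Qn3 columns; the same data now sit under the new names.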
We also need to add new variables about usage of brands and specific flavors of e-cigarettes, because although later years have questions about this, this year unfortunately only has broad questions about flavors. Eventually we want to put the data from each year together, so we need the same variables for each year. Thus we need to add variables for brand and flavor to the 2015 data set that just contain missing values (NA).
To create these new variables, we will use the mutate function of the dplyr package. The mutate function can also be used to change the existing variables and create new variables from the old ones.
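A minimal sketch of adding all-missing placeholder columns with mutate() might look like the following. The tibble `toy_2015` is a hypothetical stand-in, and only three of the new columns are shown to keep the example short.

```r
library(dplyr)

# Hypothetical miniature version of the renamed 2015 data
toy_2015 <- tibble(Age = c(10, 9), Sex = c(2, 1))

# Each new column is filled with NA for every row
toy_2015 <- toy_2015 %>%
  mutate(brand_ecig = NA,
         menthol = NA,
         no_use = NA)

glimpse(toy_2015)
```

Columns created this way are logical (`<lgl>`) vectors of NA, which is the type shown for the new variables in the output that follows.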
Now we can see how the data has changed:
Rows: 17,711
Columns: 38
$ psu <chr> "015438", "015438", "015438", "015438", "015438"…
$ finwgt <dbl> 216.7268, 324.9620, 324.9620, 397.1552, 264.8745…
$ stratum <chr> "BR3", "BR3", "BR3", "BR3", "BR3", "BR3", "BR3",…
$ Age <dbl> 10, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 1…
$ Sex <dbl> 2, 1, 1, 1, 2, 2, 1, 2, 1, 2, 2, 2, 1, 2, 2, 1, …
$ Grade <dbl> 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, …
$ ECIGT <dbl> 2, 1, 2, 1, 2, 1, 1, 1, 2, 2, 2, 2, 1, 2, 2, 1, …
$ ECIGAR <dbl> 1, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ ESLT <dbl> 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, …
$ EELCIGT <dbl> 2, 1, 2, 1, 2, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 1, …
$ EROLLCIGTS <dbl> 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, …
$ EFLAVCIGTS <dbl> 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ EBIDIS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ EFLAVCIGAR <dbl> 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, …
$ EHOOKAH <dbl> 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, …
$ EPIPE <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ ESNUS <dbl> 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ EDISSOLV <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CCIGT <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CCIGAR <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CSLT <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CELCIGT <dbl> 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CROLLCIGTS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CFLAVCIGTS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CBIDIS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CHOOKAH <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CPIPE <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CSNUS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ CDISSOLV <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ brand_ecig <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ menthol <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ clove_spice <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ fruit <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ chocolate <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ alcoholic_drink <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ candy_dessert_sweets <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ other <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ no_use <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
The data for 2016-2018 have many common attributes, so we will want to write code that can be applied to all three data sets. To do this, we will use a function in R, which is basically a piece of code that can be applied to similar but different objects in R (e.g., the data tibbles from each of these three years). For more information on functions, see for example here.
These next 3 years have the same structure for many of the questions we are interested in. For example, they all have flavor questions, but not a brand question. Moreover, their variable names are consistent across the years; for each year, we want to replace the vague question name Q50A with the value menthol in all three data sets, and the same is true for the other flavor variables. For each of these years, we also want to create a variable brand_ecig that just contains NA values.
Since we want to perform the same modifications on the data from all three years, rather than repeating the same somewhat messy piece of code three times, we can do this more efficiently if we create a function to do all of these steps at once. Then we can use the map_at() function of the purrr package to apply the function we created (for renaming variables etc.) to the data from 2016-2018. By using vars() inside of the map_at() function we can specify what tibbles within our nyts_data list we want to include or exclude.
# Rename the question codes shared by 2016-2018 and add an empty brand column
Update_survey <- function(x) {
  x %>%
    rename(Age = Q1,
           Sex = Q2,
           Grade = Q3,
           menthol = Q50A,
           clove_spice = Q50B,
           fruit = Q50C,
           chocolate = Q50D,
           alcoholic_drink = Q50E,
           candy_dessert_sweets = Q50F,
           other = Q50G,
           no_use = Q50H) %>%
    mutate(brand_ecig = NA)
}
# A few equivalent ways to apply the function to the 2016-2018 tibbles:
# nyts_data <- nyts_data %>% map_at(vars(nyts2016, nyts2017, nyts2018), Update_survey)
# nyts_data <- nyts_data %>% map_at(c("nyts2016", "nyts2017", "nyts2018"), Update_survey)
nyts_data <- nyts_data %>% map_at(vars(-nyts2015, -nyts2019), Update_survey)
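To see what map_at() does in isolation, here is a small self-contained sketch. The list `toy_list` and the helper `rename_age()` are hypothetical and exist only for illustration; the elements are selected with a character vector of names, one of the options listed in the comments above.

```r
library(dplyr)
library(purrr)

# Hypothetical miniature list of yearly tibbles
toy_list <- list(
  nyts2015 = tibble(Age = 10),
  nyts2016 = tibble(Q1 = 9),
  nyts2017 = tibble(Q1 = 11)
)

# Hypothetical helper: rename the question code to a readable name
rename_age <- function(x) {
  x %>% rename(Age = Q1)
}

# Only the named elements are modified; nyts2015 passes through unchanged
toy_list <- toy_list %>%
  map_at(c("nyts2016", "nyts2017"), rename_age)

map(toy_list, names)
```

Elements that are not selected are returned as they were, so the list keeps its length and element names.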
The final year, 2019, has a slightly different data structure compared to these earlier data sets. For example, it actually has a brand_ecig variable already, but does not contain a no_use variable. So we can’t use the same function for this data set, but have to write an individual code chunk.
nyts_data[["nyts2019"]] <- nyts_data[["nyts2019"]] %>%
  rename(brand_ecig = Q40,
         Age = Q1,
         Sex = Q2,
         Grade = Q3,
         menthol = Q62A,
         clove_spice = Q62B,
         fruit = Q62C,
         chocolate = Q62D,
         alcoholic_drink = Q62E,
         candy_dessert_sweets = Q62F,
         other = Q62G) %>%
  # 2019 has no no_use question, so fill the column with missing values (NA)
  mutate(no_use = NA)
Now let’s take a look at the variable names for each of the years using the map function from purrr.
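A minimal sketch of this check, using a hypothetical two-element list (`toy_list`) with made-up columns, could look like this. The extra `EHTP` column in the 2019 stand-in is an assumption chosen for illustration.

```r
library(purrr)
library(tibble)

# Hypothetical miniature list; the columns are made up for illustration
toy_list <- list(
  nyts2015 = tibble(Age = 10, Sex = 2, no_use = NA),
  nyts2019 = tibble(Age = 9, Sex = 1, no_use = NA, EHTP = 2)
)

# One character vector of column names per year
map(toy_list, names)

# Variables present in the 2019 stand-in but not in the 2015 stand-in
only_2019 <- setdiff(names(toy_list$nyts2019), names(toy_list$nyts2015))
only_2019
```

Comparing the name vectors with setdiff() is one quick way to spot which variables still differ between years.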
It’s looking better! There are still some differences in the set of variables in the different years, but things are much closer.
Updating Values
Now that we have made some progress on the selection and names of the variables themselves, we will work on the values contained in the different variables.
We can start with updating the values for Age and Grade, so that they are more understandable.
Recall from the codebook for this year’s data set that Age isn’t listed in the way one might expect, i.e., it is not just a number of years, but a numerically valued categorical variable.

The same is true for Grade:

This is why it is so important to always check the codebook!!
To turn the codes into actual ages and grades, we add 8 to the Age code and 5 to the Grade code. We also want to replace the resulting value of 19 for Age with ">18" and the value of 13 for Grade with "Ungraded/Other". Also according to the codebooks, for the tobacco use variables a numeric value of 1 indicates a survey answer of TRUE, while a value of 2 indicates FALSE. Sex needs to be recoded as well, but it is unclear whether it is encoded the same way across years.
Let’s create a function to make all these updates except for Sex. We will use the mutate function again, as well as mutate_all and recode to replace specific values of certain variables.
# Convert codes to readable values for Age, Grade, and the use variables
Update_values <- function(x) {
  x %>%
    # Shift the codebook codes to actual ages and grades
    mutate(Age = as.numeric(Age) + 8,
           Grade = as.numeric(Grade) + 5) %>%
    mutate(Age = as.factor(Age),
           Grade = as.factor(Grade)) %>%
    # Treat "*" and "**" entries as missing
    mutate_all(~ replace(., . %in% c("*", "**"), NA)) %>%
    mutate(Age = recode(Age, `19` = ">18"),
           Grade = recode(Grade, `13` = "Ungraded/Other")) %>%
    # Ever (E...) and current (C...) use variables: 1 = TRUE, 2 = FALSE
    mutate_at(vars(starts_with("E", ignore.case = FALSE),
                   starts_with("C", ignore.case = FALSE)),
              list(~ recode(.,
                            `1` = TRUE,
                            `2` = FALSE,
                            .default = NA,
                            .missing = NA)))
}
nyts_data <- nyts_data %>% map(Update_values)
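To make the effect of these recoding steps concrete, the sketch below applies the same kind of operations to a small made-up tibble (`toy`, hypothetical). It covers the shift of the Age and Grade codes, the relabelling of the top categories, and the conversion of a 1/2 coded use variable to logical.

```r
library(dplyr)

# Hypothetical codes: lowest, a middle value, and the highest category
toy <- tibble(Age = c(1, 10, 11),
              Grade = c(1, 7, 8),
              ECIGT = c(1, 2, NA))

toy <- toy %>%
  # Shift codes to actual values, then store them as factors
  mutate(Age = as.factor(as.numeric(Age) + 8),
         Grade = as.factor(as.numeric(Grade) + 5)) %>%
  # Relabel the top categories
  mutate(Age = recode(Age, `19` = ">18"),
         Grade = recode(Grade, `13` = "Ungraded/Other")) %>%
  # 1 becomes TRUE, 2 becomes FALSE, anything else becomes NA
  mutate(ECIGT = recode(ECIGT,
                        `1` = TRUE,
                        `2` = FALSE,
                        .default = NA,
                        .missing = NA))

toy
```

Here an Age code of 10 becomes 18 and a Grade code of 7 becomes 12, matching the first rows of the combined data shown later in this section.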
Now let’s update the Sex encoding:
It’s a bit difficult to tell if the encoding was the same across years, so let’s check using the count() function of the dplyr package.
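A minimal sketch of this check on a hypothetical two-year list (`toy_list`, made-up values) might look like this: map() applies count() to the Sex column of each tibble and returns one small table per year.

```r
library(dplyr)
library(purrr)

# Hypothetical miniature list of yearly tibbles
toy_list <- list(
  nyts2015 = tibble(Sex = c(1, 2, 2, NA)),
  nyts2016 = tibble(Sex = c(1, 1, 2))
)

# One table of counts per year, including a row for missing values
sex_counts <- map(toy_list, ~ count(.x, Sex))
sex_counts
```

The counts for each code can then be compared with the numbers reported in the codebook to work out which code corresponds to which category.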
According to the codebook, we should have:
1) 8,958 males in 2015
2) 10,438 males in 2016
3) 8,881 males in 2017
4) 10,069 males in 2018
5) 9,803 males in 2019
$nyts2015
# A tibble: 3 x 2
Sex n
<dbl> <int>
1 1 8958
2 2 8622
3 NA 131
$nyts2016
# A tibble: 3 x 2
Sex n
<dbl> <int>
1 1 10438
2 2 10082
3 NA 155
$nyts2017
# A tibble: 3 x 2
Sex n
<dbl> <int>
1 1 8881
2 2 8815
3 NA 176
$nyts2018
# A tibble: 3 x 2
Sex n
<dbl> <int>
1 1 10069
2 2 9920
3 NA 200
$nyts2019
# A tibble: 3 x 2
Sex n
<chr> <int>
1 .N 116
2 1 9803
3 2 9099
Thus, it looks like males were encoded as 1 for each year.
- 8,958 males in 2015 where 1 = male
- 10,438 males in 2016 where 1 = male
- 8,881 males in 2017 where 1 = male
- 10,069 males in 2018 where 1 = male
- 9,803 males in 2019 where 1 = male
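One way to apply this recoding to every year is sketched below. This is an illustration under assumptions rather than the exact code used here: `toy_list` and `recode_sex()` are hypothetical, and Sex is converted to character first because it is numeric in some years and character in 2019.

```r
library(dplyr)
library(purrr)

# Hypothetical miniature list: numeric codes in 2015, character codes in 2019
toy_list <- list(
  nyts2015 = tibble(Sex = c(1, 2, NA)),
  nyts2019 = tibble(Sex = c("1", "2", ".N"))
)

# Hypothetical helper: recode 1/2 to male/female and return a factor
recode_sex <- function(x) {
  x %>%
    mutate(Sex = recode_factor(as.character(Sex),
                               `1` = "male",
                               `2` = "female"))
}

toy_list <- map(toy_list, recode_sex)
map(toy_list, ~ count(.x, Sex))
```

Values that are not listed, such as ".N", are left as they are, so they still need to be handled separately.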
$nyts2015
# A tibble: 3 x 2
Sex n
<fct> <int>
1 male 8958
2 female 8622
3 <NA> 131
$nyts2016
# A tibble: 3 x 2
Sex n
<fct> <int>
1 male 10438
2 female 10082
3 <NA> 155
$nyts2017
# A tibble: 3 x 2
Sex n
<fct> <int>
1 male 8881
2 female 8815
3 <NA> 176
$nyts2018
# A tibble: 3 x 2
Sex n
<fct> <int>
1 male 10069
2 female 9920
3 <NA> 200
$nyts2019
# A tibble: 3 x 2
Sex n
<fct> <int>
1 .N 116
2 male 9803
3 female 9099
Looks good!
The years (2016-2019) that have flavors also need the flavor data to be logical (meaning TRUE or FALSE):
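A rough sketch of such a conversion is shown below. It rests on an assumption that should be checked against the codebooks: that a selected flavor is stored as 1 and that any other entry means the flavor was not selected. The tibble `toy_2016` is hypothetical.

```r
library(dplyr)

# Hypothetical flavor columns: 1 = selected, NA = not selected (assumption)
toy_2016 <- tibble(menthol = c(1, NA, 1),
                   fruit = c(NA, 1, NA),
                   no_use = c(NA, NA, NA))

# TRUE where the value is 1, FALSE everywhere else (including NA)
toy_2016 <- toy_2016 %>%
  mutate(across(menthol:no_use, ~ .x %in% 1))

toy_2016
```

If missing entries should stay missing rather than become FALSE, `.x == 1` could be used instead of `.x %in% 1`.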
Now there are just a few changes needed that are specific to 2019:
nyts_data[["nyts2019"]] <- nyts_data[["nyts2019"]] %>%
  # Treat the 2019 missing-value codes as NA
  mutate_all(~ replace(., . %in% c(".N", ".S", ".Z"), NA)) %>%
  mutate(psu = as.character(psu)) %>%
  mutate(brand_ecig = recode(brand_ecig,
                             `1` = "Other", # levels 1 and 8 are combined into "Other"
                             `2` = "Blu",
                             `3` = "JUUL",
                             `4` = "Logic",
                             `5` = "MarkTen",
                             `6` = "NJOY",
                             `7` = "Vuse",
                             `8` = "Other"))
Great! Now we have all the same variables and our values don’t need to be handled any differently for any of the years. Thus we can combine the data across years.
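A minimal sketch of combining a named list of tibbles, assuming the dplyr function bind_rows() with its `.id` argument, is shown below. The list `toy_list` is hypothetical, and the `EHTP` column is deliberately left out of the 2015 stand-in to show how columns missing from one year are handled.

```r
library(dplyr)

# Hypothetical miniature list of cleaned yearly tibbles
toy_list <- list(
  nyts2015 = tibble(Age = factor("18"), EELCIGT = TRUE),
  nyts2019 = tibble(Age = factor("17"), EELCIGT = FALSE, EHTP = TRUE)
)

# Stack the tibbles; `.id` stores each list element's name in a new column
combined <- bind_rows(toy_list, .id = "year")
combined
```

Columns that are absent from a given year are filled with NA for that year's rows, and the new `year` column holds the names of the list elements.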
Rows: 95,465
Columns: 41
$ year <chr> "nyts2015", "nyts2015", "nyts2015", "nyts2015", …
$ psu <chr> "015438", "015438", "015438", "015438", "015438"…
$ finwgt <dbl> 216.7268, 324.9620, 324.9620, 397.1552, 264.8745…
$ stratum <chr> "BR3", "BR3", "BR3", "BR3", "BR3", "BR3", "BR3",…
$ Age <fct> 18, 17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, …
$ Sex <fct> female, male, male, male, female, female, male, …
$ Grade <fct> 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, …
$ ECIGT <lgl> FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRU…
$ ECIGAR <lgl> TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FA…
$ ESLT <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, …
$ EELCIGT <lgl> FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRU…
$ EROLLCIGTS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, …
$ EFLAVCIGTS <lgl> FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, F…
$ EBIDIS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ EFLAVCIGAR <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, F…
$ EHOOKAH <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ EPIPE <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ ESNUS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, …
$ EDISSOLV <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CCIGT <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, …
$ CCIGAR <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, …
$ CSLT <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CELCIGT <lgl> FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, …
$ CROLLCIGTS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CFLAVCIGTS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CBIDIS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CHOOKAH <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CPIPE <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CSNUS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CDISSOLV <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ brand_ecig <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ menthol <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ clove_spice <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ fruit <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ chocolate <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ alcoholic_drink <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ candy_dessert_sweets <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ other <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ no_use <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ EHTP <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ CHTP <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
We will want to do some of our analysis split by year, so we would like to be sure we have one variable that has the correct value for year. It looks like we just need to remove "nyts" from the year variable that we created from the names of the tibbles in our list and we should be all set.
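One way to do this, sketched here with the stringr function str_remove() on a hypothetical two-row tibble (`combined`), is to strip the prefix and convert the remainder to a number.

```r
library(dplyr)
library(stringr)

# Hypothetical miniature version of the combined data
combined <- tibble(year = c("nyts2015", "nyts2019"),
                   Age = c("18", "17"))

# Drop the "nyts" prefix, then convert the remaining text to a number
combined <- combined %>%
  mutate(year = as.numeric(str_remove(year, "nyts")))

combined
```

Storing the year as a number makes it straightforward to group, filter, or plot by year later on.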
Here is our clean and wrangled data:
Rows: 95,465
Columns: 41
$ year <dbl> 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, …
$ psu <chr> "015438", "015438", "015438", "015438", "015438"…
$ finwgt <dbl> 216.7268, 324.9620, 324.9620, 397.1552, 264.8745…
$ stratum <chr> "BR3", "BR3", "BR3", "BR3", "BR3", "BR3", "BR3",…
$ Age <fct> 18, 17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, …
$ Sex <fct> female, male, male, male, female, female, male, …
$ Grade <fct> 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, …
$ ECIGT <lgl> FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRU…
$ ECIGAR <lgl> TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FA…
$ ESLT <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, …
$ EELCIGT <lgl> FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRU…
$ EROLLCIGTS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, …
$ EFLAVCIGTS <lgl> FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, F…
$ EBIDIS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ EFLAVCIGAR <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, F…
$ EHOOKAH <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ EPIPE <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ ESNUS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, …
$ EDISSOLV <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CCIGT <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, …
$ CCIGAR <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, …
$ CSLT <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CELCIGT <lgl> FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, …
$ CROLLCIGTS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CFLAVCIGTS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CBIDIS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CHOOKAH <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CPIPE <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CSNUS <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ CDISSOLV <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
$ brand_ecig <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ menthol <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ clove_spice <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ fruit <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ chocolate <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ alcoholic_drink <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ candy_dessert_sweets <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ other <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ no_use <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ EHTP <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ CHTP <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
Note that several variables have similar names that differ only in starting with a C or an E. Those starting with C relate to questions about current usage (within the last 30 days), while those starting with E relate to usage at any point in the subject’s life (“ever” usage).
Reminder: Current users are a subset of ever users.